Fix sortby descending order placing NaNs at beginning instead of end#11118
Fix sortby descending order placing NaNs at beginning instead of end#11118jsignell merged 2 commits intopydata:mainfrom
Conversation
|
Thank you for opening this pull request! It may take us a few days to respond here, so thank you for being patient. |
jsignell
left a comment
There was a problem hiding this comment.
Thanks @kkollsga this seems to work although it would be good to include a multidimensional test as well. It does seem a little verbose though. I was experimenting with something like this instead:
if ascending:
indices[key] = np.lexsort(tuple(reversed(arrays)))
else:
# sort any null values to the beginning before reversing the order
indices[key] = np.lexsort(
tuple(
[*reversed(arrays), *[notnull(arr) for arr in reversed(arrays)]]
)
)[::-1]which relies on notnull from duck_array_ops
|
that does look nicer. Another though I had was counting the number of NaNs; and call |
|
Thanks @jsignell and @dcherian for the suggestions! I tested @jsignell's approach locally - it's much cleaner and handles all null types via Happy to update the PR with this implementation and add a multidimensional test case, e.g.: ds = Dataset({
"var": (["x", "y"], [[1, 2], [3, 4], [5, 6]]),
"key": ("x", [3.0, np.nan, 1.0])
})
result = ds.sortby("key", ascending=False)
# x order: [0, 2, 1] with NaN row at end |
Maybe multidimensional wasn't quite what I meant actually. I would love to see a test where there are multiple arrays passed to the >>> ds = xr.Dataset(
... {
... "A": (("x", "y"), [[1, 2, 3], [4, 5, 6]]),
... "B": (("x", "y"), [[7, 8, 9], [10, 11, 12]]),
... },
... coords={"x": ["b", "a"], "y": [np.nan, 1, 0]},
... )
>>> ds.sortby(["x", "y"], ascending=False) |
Use duck_array_ops.notnull as additional sort keys to ensure null values sort to the end in descending order. This is cleaner than the previous approach of manually tracking NaN positions. Fixes pydata#7358 Co-Authored-By: Claude <noreply@anthropic.com>
291cfb9 to
6aeb994
Compare
jsignell
left a comment
There was a problem hiding this comment.
This looks good to me! I'll plan on merging it later today unless there is more feedback.
whats-new.rstSummary
Fixed
sortby(ascending=False)placing NaN values at the beginning of results instead of the end.The issue was that descending sort simply reversed the ascending order with
order[::-1], which moved NaNs from end to beginning.The fix detects trailing NaN elements in the ascending order and only reverses the non-NaN portion, keeping NaNs at the end.
Test plan
test_sortby_descending_nansregression test🤖 Generated with Claude Code